Introduction

Addax (Addax nasomaculatus) are considered one of the rarest antelopes on earth, with an estimated population of <100 individuals in the wild. We assessed the distribution and occurrence of this species from field surveys collected by the Sahara Conservation Fund over a 7-year period (2008-2014). Our results provide insight into the factors contributing to species occurrence and are guiding field surveys to areas that have the potential to support small and geographically isolated populations. We incorporated field-derived variables of vegetation cover with remote sensing measures of vegetation productivity (NDVI - Normalized Difference Vegetation Index) and surface roughness (derived from SRTM - Shuttle Radar Topography Mission). Models were fit in a generalized linear regression framework to evaluate and predict addax occurrence.

Addax moving across shifting sands in the Tin Toumma desert, Niger (Photo: T.Rabeil, SCF)

Lab Excercise

In this excercise, we will follow steps detailed in:

Stabach, J.A., T. Rabeil, V. Turmine, T. Wacher, T. Mueller, and P. Leimgruber. 2017. On the brink of extinction - Habitat selection of addax and dorcas gazelle across the Tin Toumma desert, Niger. Diversity and Distributions 23:581-591.

We will use R for all analyses to create a species distribution model, highlighting the variables important in predicting addax occurrence. In this exercise (part 1 of 2), we will take the first steps to providing a predictive surface of addax habitat suitability. This will include:

  • Loading and processing available spatial data
  • Creating a proxy for surface roughness
  • Relating occurrence (extract) with remotely sensed data

Import & Process Spatial Data

One of the most difficult (and perhaps most time consuming) aspects of conducting a spatial analysis is the construction of the spatial database. Main components include locating and downloading available spatial data, mosaicking (if necessary) multiple tiles together, reprojecting the data so that all layers have the same coordinate reference system (CRS), and resampling the cell sizes to match.

Multiple sources now exist for finding available spatial data, which include EarthExplorer and Google Earth Engine. Some layers, however, can also be directly accessed within R using the raster, Dismo and MODIS packages (among others).

Load Packages

Like other exercises in R, we will first load the necessary packages to perform the analyses. Use help() for any operations that you don’t understand or that require additional information.

# Remove objects from memory
rm(list=ls())

# Load required libraries
library(raster)
library(rgdal)
library(sp)

Occurrence Data

First, we will load the spatial points dataframe (our sampling point locations). This is the occurrence dataset collected by partners in Niger. In this case, I have saved the file as a shapefile (Lat/Long, WGS84) that we will load directly into R. This file could just as easily be loaded as a .csv and then converted to a spatial object by defining the projection.

# Load the point data and project to UTM32N
Addax <- readOGR(dsn="Data", layer="Occurrence_dd")
## OGR data source with driver: ESRI Shapefile 
## Source: "C:\Jared\GitHub\Addax_DistributionModel\Data", layer: "Occurrence_dd"
## with 661 features
## It has 12 fields
## Integer64 fields read as strings:  Pt_ID Id_1
Addax
## class       : SpatialPointsDataFrame 
## features    : 661 
## extent      : 11.71088, 12.27343, 16.01946, 16.95161  (xmin, xmax, ymin, ymax)
## crs         : +proj=longlat +datum=WGS84 +no_defs 
## variables   : 12
## names       : Pt_ID,       Date, YearMonth, Id_1,           X,            Y, Addax, Dorcas, Cornul, Stipa1, Stipa2, Human 
## min values  :   109, 2008-12-06,    200812,    1, 789521.0073, 1773471.7647,     0,      0,      0,      0,      0,     0 
## max values  :    84, 2014-12-05,    201412,    8,  850099.989, 1876865.5947,   166,    136,      1,      1,      1,    17
# Project to UTM32N, WGS84
# See: https://spatialreference.org/
UTM32N.proj <- "+proj=utm +zone=32 +ellps=WGS84 +units=m +no_defs"
Addax <- spTransform(Addax, CRS = UTM32N.proj) # Note that I am overwriting the original Lat/Long object

# Look at the file
head(Addax)
##   Pt_ID       Date YearMonth Id_1        X       Y Addax Dorcas Cornul Stipa1
## 1    13 2008-12-06    200812   23 825358.1 1773472     0      3      0      1
## 2    14 2008-12-06    200812   23 830041.8 1776055     0      5      0      0
## 3    15 2008-12-06    200812   23 835354.5 1779007     0      7      0      0
## 4    17 2008-12-06    200812   23 840739.0 1781912     0      2      0      1
## 5    19 2008-12-06    200812   23 845644.1 1784587     0      1      0      0
## 6    21 2008-12-06    200812   23 850100.0 1787353     0     11      0      0
##   Stipa2 Human
## 1      0     0
## 2      0     0
## 3      0     1
## 4      0     0
## 5      0     1
## 6      1     0
# Is the file projected?
projection(Addax)
## [1] "+proj=utm +zone=32 +ellps=WGS84 +units=m +no_defs"
# Plot - Not much context, but we at least can verify that the file loaded correctly
# We'll overlay on some base layers
#plot(Addax)

Raster Data

R has a number of options for directly importing remotely sensed data into the console. Here, we will import a NDVI data layer (a .tif file) that I downloaded from EarthExplorer. This layer is located in your data directory and has already been re-projected to UTM 32N WGS84 and clipped to the study area. We will also use functions in the raster package to load a 90-m elevation dataset. Both layers must exactly match each other (same CRS, extent, grid cell size) in order to predict with the rasters.

# NDVI - Normalized Difference Vegetation Index (250-m)
# ********************************************
# ********************************************
# Note the file is already projected to UTM32N WGS84
ndvi <- raster("Data/MOD13Q1_Nov2017.tif")
ndvi <- ndvi * 0.0001 # Convert to decimal values (This is done to reduce the file size)
ndvi
## class      : RasterLayer 
## dimensions : 421, 376, 158296  (nrow, ncol, ncell)
## resolution : 250, 250  (x, y)
## extent     : 760705.1, 854705.1, 1771859, 1877109  (xmin, xmax, ymin, ymax)
## crs        : +proj=utm +zone=32 +datum=WGS84 +units=m +no_defs 
## source     : memory
## names      : MOD13Q1_Nov2017 
## values     : 0.0324, 0.1276  (min, max)
# SRTM - Shuttle Radar Topography Mission (90-m)
# ********************************************
# ********************************************
srtm <- getData('SRTM', lon=12, lat=16.5) # Inputting general coordinates to grab tile
srtm
## class      : RasterLayer 
## dimensions : 6000, 6000, 3.6e+07  (nrow, ncol, ncell)
## resolution : 0.0008333333, 0.0008333333  (x, y)
## extent     : 10, 15, 15, 20  (xmin, xmax, ymin, ymax)
## crs        : +proj=longlat +datum=WGS84 +no_defs 
## source     : C:/Jared/GitHub/Addax_DistributionModel/srtm_39_09.tif 
## names      : srtm_39_09 
## values     : -32768, 32767  (min, max)
# Clip (crop) the raster to the study area extent. This will help with processing the image
# Read in a study area boundary (also in Lat/Long).
SA <- readOGR(dsn="Data", layer="StudyArea")
## OGR data source with driver: ESRI Shapefile 
## Source: "C:\Jared\GitHub\Addax_DistributionModel\Data", layer: "StudyArea"
## with 1 features
## It has 1 fields
plot(srtm)
plot(SA, add=T)

# Crop the raster to the study area extent
# You could also manually set the extent, if inclined to do so (SA <- extent(11,13,16,17))
srtm <- crop(srtm,SA)

# Note: In many cases, you might need to download multiple tiles to cover your study area.  To put them together, you'd need to use the mosaic function.  Be careful, rasters can get large.
#srtm1 <- getData('SRTM', lon = 12, lat = 16.5)
#srtm2 <- getData('SRTM', lon = 15, lat = 16.5)
#srtm.mosaic <- mosaic(srtm1,srtm2, fun=mean) # Using mean for overlapping region

# Multiple other options are available with the getData command (Worldclim, GADM, and future climate data - CMIP5)
# Worldclim:
  # Ramiro will be going over the bioclim variables in the next few weeks

# Global Administrative Boundaries (GADM):
# Sometimes very useful to verify you are working where you think you are
Niger <- getData('GADM', country = "Niger", level = 0) 

# Plot together - Overlapping where I think they should!
plot(Niger)
plot(srtm, add=T)

Raster Project & Resample

You’ll note that the projection of the NDVI layer is different than the SRTM data layer. We need to project the SRTM data layer to UTM32N WGS, making sure the extents, cell sizes (essential for final prediction), and grids all match. We’ll use bilinear interpolation because the data are continuous. For a categorical image (i.e., land-cover classification), use nearest neighbor. We also now have a choice to make about the resolution that we will conduct our analyses. What resolution should we use?

# Project the srtm image, setting the resolution
srtm.utm <- projectRaster(srtm, res = 90, crs = UTM32N.proj, method = 'bilinear') # This can be a slow process, depending on the size of the raster

# Resample the srtm image to the ndvi image
# The native resolution of our analysis will then be 250m - This will help to speed up the processing, but means we are losing some of the fine-scale information contained in the 90m elevation model which might be important
srtm.250m <- resample(srtm.utm, ndvi, "bilinear")

# Evaluate if both rasters have the same extent, rows/columns, projection, and resolution
compareRaster(srtm.250m, ndvi)
## [1] TRUE
# You might need to `crop` the image, if the extents don't match
#ext <- extent(ndvi) # Smaller file
#example.crop <- crop(srtm.250m, ext)

Questions:

  1. Are these files projected? How would you check?
  2. What is the R function for re-projecting raster and/or spatial point data files if you needed to?
  3. What should you always do when importing spatial data (also good practice for any data you load into R)?

Plotting Data

Now that we have projected and processed the spatial data, let’s make a plot of all the data to make sure everything looks good. Data are located where I expect them to be. Amazing!

# Plot the result (order matters)
plot(srtm.250m)
plot(Addax,pch=15,cex=0.7,add=TRUE)

Generate Terrain Variables

We assumed that addax would select inter-dunal depressions. Instead of including the raw SRTM (elevation) data in our models, we created a variable derived from the elevation dataset, termed Surface Roughness. This variable is the change in elevation range between neighboring pixels. That is, it is the difference between the minimum and maximum values of a cell and its eight surrounding neighbors. Pixels with similar neighborhood values have a low change in elevation range (e.g., flat areas), whereas highly variables neighborhoods have a large change in elevation range (e.g., steep slopes) (Wilson et al. 2007).

Terrain variables can be easily applied to digital elevation models. Variables that can be calculated include slope, aspect, Terrain Position Index (TPI), Terrain Ruggedness Index (TRI), and surface roughness. Some of these variables are likely to be highly correlated. Use help(terrain) for additional information. In this example, we will calculate surface roughness.

rough <- terrain(srtm.250m,opt='roughness')
plot(rough)
plot(Addax,pch=15,cex=0.7,add=TRUE)

Raster Extraction

The extract command is an extremely useful and powerful function for joining spatial objects. Here, we will extract surface roughness and NDVI at each of the occurrence locations. If we stack the layers together, we can extract all the rasters together with a single command. See help(extract) for additional information about this function.

r.stack <- stack(ndvi,rough)
#plot(r.stack)

# Extract NDVI and rough  
Extract <- extract(r.stack,Addax)
colnames(Extract) <- c("ndvi","ROUGH")

# Append to spatialobject
Addax <- cbind(Addax,Extract) 

# Look at the data
head(Addax)
##   Pt_ID       Date YearMonth Id_1        X       Y Addax Dorcas Cornul Stipa1
## 1    13 2008-12-06    200812   23 825358.1 1773472     0      3      0      1
## 2    14 2008-12-06    200812   23 830041.8 1776055     0      5      0      0
## 3    15 2008-12-06    200812   23 835354.5 1779007     0      7      0      0
## 4    17 2008-12-06    200812   23 840739.0 1781912     0      2      0      1
## 5    19 2008-12-06    200812   23 845644.1 1784587     0      1      0      0
## 6    21 2008-12-06    200812   23 850100.0 1787353     0     11      0      0
##   Stipa2 Human   ndvi    ROUGH
## 1      0     0 0.0956 19.56319
## 2      0     0 0.0882 10.59894
## 3      0     1 0.0854 10.50461
## 4      0     0 0.0920 19.73361
## 5      0     1 0.0938 17.06398
## 6      1     0 0.0878 23.42404

Note: I have simplified things a bit here because we have a time series of points and therefore, we must download and extract a time series of NDVI. Each of the dates need to be aligned with their corresponding NDVI image. In the next exercise, the dataset provided will already have this step completed so that we can focus on the modeling.